pygments: add typing annotation for missing base files of pygments #14832

lev-blit · 2025-10-04T21:05:42Z

No description provided.

github-actions · 2025-10-05T20:50:17Z

Diff from mypy_primer, showing the effect of this PR on open source code:

pwndbg (https://github.com/pwndbg/pwndbg)
+ pwndbg/integration/binja.py: note: In member "decompile" of class "BinjaProvider":
+ pwndbg/integration/binja.py:503: error: No overload variant of "format" matches argument types "list[tuple[Any, str]]", "Terminal256Formatter[str]"  [call-overload]
+ pwndbg/integration/binja.py:503: note: Possible overload variants:
+ pwndbg/integration/binja.py:503: note:     def [_T: (str, bytes)] format(tokens: Iterator[tuple[_TokenType, str]], formatter: Formatter[_T], outfile: SupportsWrite[_T]) -> None
+ pwndbg/integration/binja.py:503: note:     def [_T: (str, bytes)] format(tokens: Iterator[tuple[_TokenType, str]], formatter: Formatter[_T], outfile: None = ...) -> _T
+ pwndbg/integration/binja.py:509: error: No overload variant of "format" matches argument types "list[tuple[Any, str]]", "Terminal256Formatter[str]"  [call-overload]
+ pwndbg/integration/binja.py:509: note: Possible overload variants:
+ pwndbg/integration/binja.py:509: note:     def [_T: (str, bytes)] format(tokens: Iterator[tuple[_TokenType, str]], formatter: Formatter[_T], outfile: SupportsWrite[_T]) -> None
+ pwndbg/integration/binja.py:509: note:     def [_T: (str, bytes)] format(tokens: Iterator[tuple[_TokenType, str]], formatter: Formatter[_T], outfile: None = ...) -> _T
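The two pwndbg errors above come from passing a list where the new stub declares `Iterator[...]`. As a minimal sketch of the distinction, using hypothetical stand-in functions rather than the real `pygments.format`: a list is an `Iterable` but not an `Iterator`, so a parameter annotated as `Iterator` rejects it under a type checker, even though the call works at runtime when the function only loops over its argument.

```python
from collections.abc import Iterable, Iterator

Token = tuple[str, str]  # hypothetical stand-in for (token type, value)


def join_values_strict(tokens: Iterator[Token]) -> str:
    # Annotated like the stub in the error message: only iterators type-check.
    return "".join(value for _, value in tokens)


def join_values_relaxed(tokens: Iterable[Token]) -> str:
    # Accepts lists, generators, and any other iterable.
    return "".join(value for _, value in tokens)


tokens = [("Keyword", "def"), ("Text", " ")]

# A list supports iteration but is not itself an iterator (it has no __next__).
list_is_iterable = isinstance(tokens, Iterable)
list_is_iterator = isinstance(tokens, Iterator)

# A type checker is expected to reject join_values_strict(tokens);
# wrapping the list in iter() satisfies the annotation.
strict_result = join_values_strict(iter(tokens))
relaxed_result = join_values_relaxed(tokens)
```

Whether the stub should accept `Iterable` instead depends on how the real function consumes its argument, which this sketch does not establish.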

sphinx (https://github.com/sphinx-doc/sphinx)
+ sphinx/highlighting.py:39: error: Unused "type: ignore" comment  [unused-ignore]
+ sphinx/highlighting.py:39:5: error: Cannot assign to a method  [method-assign]
+ sphinx/highlighting.py:39:5: note: Error code "method-assign" not covered by "type: ignore" comment
+ sphinx/highlighting.py:39:35: error: Incompatible types in assignment (expression has type "classmethod[Any, [Any], Any]", variable has type "Callable[[Any], GenericAlias]")  [assignment]
+ sphinx/highlighting.py:39:35: note: Error code "assignment" not covered by "type: ignore" comment

alectryon (https://github.com/cpitclaudel/alectryon)
+ alectryon/pygments_lexer.py:456: error: Dict entry 3 has incompatible type "str": "list[tuple[str | Any, ...]]"; expected "str": "list[tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]]] | tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]], str] | include | default]"  [dict-item]
+ alectryon/pygments_lexer.py:457: error: Dict entry 4 has incompatible type "str": "list[tuple[str | Any, ...]]"; expected "str": "list[tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]]] | tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]], str] | include | default]"  [dict-item]
+ alectryon/pygments_lexer.py:477: error: Dict entry 7 has incompatible type "str": "list[object]"; expected "str": "list[tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]]] | tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]], str] | include | default]"  [dict-item]
+ alectryon/pygments_lexer.py:490: error: Dict entry 9 has incompatible type "str": "list[object]"; expected "str": "list[tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]]] | tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]], str] | include | default]"  [dict-item]
+ alectryon/pygments_lexer.py:491: error: Dict entry 10 has incompatible type "str": "list[object]"; expected "str": "list[tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]]] | tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]], str] | include | default]"  [dict-item]
+ alectryon/pygments_lexer.py:492: error: Dict entry 11 has incompatible type "str": "list[object]"; expected "str": "list[tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]]] | tuple[str | words, _TokenType | Iterator[tuple[int, _TokenType, str]] | Callable[[Lexer, _PseudoMatch, LexerContext], Iterator[tuple[int, _TokenType, str]]], str] | include | default]"  [dict-item]

srittau

Not a full review, just a few things I have spotted.

I haven't looked at the output of mypy_primer, but the number of new typing errors in projects depending on pygments is something we need to review.

srittau · 2025-10-07T09:43:18Z

stubs/Pygments/pygments/filters/__init__.pyi

-def find_filter_class(filtername): ...
-def get_filter_by_name(filtername, **options): ...
+def find_filter_class(filtername: str) -> type[Filter]: ...
+def get_filter_by_name(filtername: str, **options: Any) -> Filter: ...


Also applies to the other **options: Any annotations in this PR: In general, when using Any we need a comment, explaining the possible types or reason for using it. In this case probably something like this:

Suggested change

-def get_filter_by_name(filtername: str, **options: Any) -> Filter: ...
+# options are forwarded to the filter's constructor
+def get_filter_by_name(filtername: str, **options: Any) -> Filter: ...

See https://typing.python.org/en/latest/guides/writing_stubs.html#using-any-and-object for details.
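To illustrate why `**options: Any` is hard to narrow here, the following is a minimal sketch with a hypothetical registry and filter classes (not the pygments implementation). Each filter class may accept different keyword arguments, so the lookup function cannot know their types and simply passes them through.

```python
from typing import Any


class Filter:
    # Hypothetical base class; real filters define their own options.
    def __init__(self, **options: Any) -> None:
        self.options = options


class UppercaseFilter(Filter):
    pass


# Hypothetical registry mapping names to filter classes.
_FILTERS: dict[str, type[Filter]] = {"uppercase": UppercaseFilter}


def find_filter_class(filtername: str) -> type[Filter]:
    return _FILTERS[filtername]


# options are forwarded to the filter's constructor
def get_filter_by_name(filtername: str, **options: Any) -> Filter:
    return find_filter_class(filtername)(**options)


example = get_filter_by_name("uppercase", case="upper")
```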

srittau · 2025-10-07T09:44:13Z

stubs/Pygments/pygments/formatters/__init__.pyi

-def get_formatter_by_name(_alias, **options): ...
-def load_formatter_from_file(filename, formattername: str = "CustomFormatter", **options): ...
-def get_formatter_for_filename(fn, **options): ...
+def find_formatter_class(alias: str) -> type[Formatter[Any]]: ...


In this case, Formatter[Any] doesn't need an explanation, as it's obvious why it's used.

srittau · 2025-10-07T09:50:22Z

stubs/Pygments/pygments/lexer.pyi

+        tokendefs: (
+            dict[
+                str,
+                list[
+                    tuple[str, _TokenType | Iterator[tuple[int, _TokenType, str]]]
+                    | tuple[str, _TokenType | Iterator[tuple[int, _TokenType, str]], str]
+                ],
+            ]
+            | None
+        ) = None,


I haven't looked at this in detail, but this could be problematic. Due to the invariance of list, this will reject calls like this:

x.process_tokendef("", {"foo": [("", "")]})

In this case, the dict could be inferred to have type dict[str, list[tuple[str, str]]], but because list is invariant, list[tuple[str, str]] is incompatible with a list whose element type is the declared union of two- and three-element tuples.

Same below.
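The invariance problem described above can be sketched with simplified, hypothetical rule types (the real stub uses `_TokenType` and more alternatives). One common way around it, offered here as an assumption rather than what the PR adopted, is to annotate parameters with the covariant `Mapping` and `Sequence` instead of `dict` and `list`.

```python
from collections.abc import Mapping, Sequence
from typing import Union

# Hypothetical, simplified rule shapes; the real stub uses _TokenType and more.
Rule = Union[tuple[str, str], tuple[str, str, str]]


def count_rules_invariant(tokendefs: dict[str, list[Rule]]) -> int:
    return sum(len(rules) for rules in tokendefs.values())


def count_rules_covariant(tokendefs: Mapping[str, Sequence[Rule]]) -> int:
    return sum(len(rules) for rules in tokendefs.values())


# Without an annotation, a checker infers dict[str, list[tuple[str, str]]].
defs = {"foo": [("", "")]}

# Expected to be rejected by a type checker: list[tuple[str, str]] is not
# list[Rule], even though every element is a valid Rule.
invariant_count = count_rules_invariant(defs)

# Expected to be accepted: Mapping and Sequence are covariant in their values.
covariant_count = count_rules_covariant(defs)
```

Both calls succeed at runtime; the difference only shows up under a type checker.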

srittau · 2025-10-07T09:55:22Z

stubs/Pygments/pygments/util.pyi

+def guess_decode_from_terminal(text: bytes, term: TextIO | Any) -> tuple[bytes, str]: ...
+def terminal_encoding(term: TextIO | Any) -> str: ...


Pseudo-protocols like TextIO should be avoided. In this case, we could use a protocol _HasEncoding or similar, but I think the better approach is to just use Any with a comment:

Suggested change

-def guess_decode_from_terminal(text: bytes, term: TextIO | Any) -> tuple[bytes, str]: ...
-def terminal_encoding(term: TextIO | Any) -> str: ...
+# In the following two methods, `term` are file-like objects that optionally have an
+# `encoding` attribute.
+def guess_decode_from_terminal(text: bytes, term: Any) -> tuple[bytes, str]: ...
+def terminal_encoding(term: Any) -> str: ...

This would be another good use case for python/typing#601.
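The two options mentioned in this comment can be sketched as follows. `_HasEncoding` is the hypothetical protocol name from the comment, and the function bodies are illustrative assumptions, not code taken from pygments. The protocol version demands the attribute, while the `Any` version also tolerates objects that lack it, which is why a protocol does not fit an optional attribute well.

```python
import io
import locale
from typing import Any, Protocol


class _HasEncoding(Protocol):
    # Hypothetical protocol: anything exposing a readable `encoding`.
    @property
    def encoding(self) -> str: ...


def encoding_of(term: _HasEncoding) -> str:
    # Type-checks only for objects that definitely have `encoding`.
    return term.encoding


def terminal_encoding_sketch(term: Any) -> str:
    # `term` is a file-like object that optionally has an `encoding` attribute.
    encoding = getattr(term, "encoding", None)
    if encoding:
        return encoding
    # Assumed fallback for objects without a usable encoding.
    return locale.getpreferredencoding()


wrapped = io.TextIOWrapper(io.BytesIO(), encoding="latin-1")
from_protocol = encoding_of(wrapped)
from_any = terminal_encoding_sketch(wrapped)
from_plain_object = terminal_encoding_sketch(object())
```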

lev-blit force-pushed the feature/pygments-fill-typing branch from ec07b24 to b3485bd on October 4, 2025 21:05


lev-blit added 3 commits October 5, 2025 22:57

pygments: add typing annotation for missing base files of pygments

390628e

add filters + formatters __init__.pyi

45b0765

more fixes

657943f

lev-blit force-pushed the feature/pygments-fill-typing branch from c670fec to 657943f on October 5, 2025 19:57


allow words and not only str, and allow default instead of whole thing

77ad3a6

lev-blit marked this pull request as ready for review October 5, 2025 20:53

srittau reviewed Oct 7, 2025
